Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies
Authors
Abstract
We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other, without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC), and Win or Learn Fast Policy Hill-Climbing (PHC-WoLF) algorithms, varying the scenario complexity (state space size), the number of training episodes, the learning rate, and the exploration rate. Our results show that Q-learning generally fails to converge, whereas PHC and PHC-WoLF always converge and perform similarly. We also show that very high, gradually decreasing exploration rates are required for convergence. We conclude that multi-agent RL of dialogue policies is a promising alternative to using single-agent RL and SUs or learning directly from corpora.
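To make the difference between the three algorithms concrete, the following is a minimal sketch based on the standard textbook formulations of Q-learning, PHC, and WoLF-PHC (Bowling and Veloso), not on the paper's own implementation, which this abstract does not describe. The functions operate on the value and policy vectors of a single state. All function names, the linear exploration schedule, and the numeric values are assumptions chosen for illustration only.

```python
def q_update(q, action, reward, next_max, alpha, gamma):
    """Standard Q-learning update for one state's action values (in place)."""
    q[action] = (1 - alpha) * q[action] + alpha * (reward + gamma * next_max)


def phc_step(pi, q, delta):
    """PHC: shift probability mass toward the greedy action.

    Each non-greedy action gives up at most delta / (n - 1), and never more
    than it currently holds, so pi stays a valid probability distribution.
    """
    n = len(pi)
    best = max(range(n), key=lambda a: q[a])
    new_pi = list(pi)
    for a in range(n):
        if a != best:
            step = min(pi[a], delta / (n - 1))
            new_pi[a] -= step
            new_pi[best] += step
    return new_pi


def wolf_delta(pi, avg_pi, q, delta_win, delta_lose):
    """WoLF: learn cautiously when winning, fast when losing.

    "Winning" means the current policy has a higher expected value than the
    running average policy; delta_lose is assumed larger than delta_win.
    """
    current = sum(p * v for p, v in zip(pi, q))
    average = sum(p * v for p, v in zip(avg_pi, q))
    return delta_win if current > average else delta_lose


def decayed_epsilon(episode, total_episodes, start, end):
    """Hypothetical linear schedule: high exploration early, low later."""
    fraction = min(episode / total_episodes, 1.0)
    return start + (end - start) * fraction


if __name__ == "__main__":
    q = [0.0, 0.0]
    q_update(q, action=0, reward=1.0, next_max=0.0, alpha=0.5, gamma=0.9)
    pi = phc_step([0.5, 0.5], q, delta=0.1)
    print(q, pi)
```

Plain Q-learning acts greedily on `q` (apart from exploration), so its behaviour can flip abruptly while the other agent is also learning. PHC and PHC-WoLF instead keep an explicit mixed policy `pi` that moves in small steps, which is the usual explanation for their better convergence in concurrent learning.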
Similar resources
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent Reinforcement Learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Improving Agent Performance for Multi-Resource Negotiation Using Learning Automata and Case-Based Reasoning
In electronic commerce markets, agents often need to acquire multiple resources to fulfil a high-level task. To obtain these resources, they need to compete with each other. In multi-agent environments in which competition is involved, negotiation is an interaction between agents in order to reach an agreement on resource allocation and to coordinate with each other. In recent ...
Reinforcement Learning of Multi-Issue Negotiation Dialogue Policies
We use reinforcement learning (RL) to learn a multi-issue negotiation dialogue policy. For training and evaluation, we build a hand-crafted agenda-based policy, which serves as the negotiation partner of the RL policy. Both the agenda-based and the RL policies are designed to work for a large variety of negotiation settings, and perform well against negotiation partners whose behavior has not be...
Strategic Foresighted Learning in Competitive Multi-Agent Games
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-agent games. We observe that in a competitive setting with adaptive agents, an agent’s actions will (likely) result in changes in the opponents’ policies. In addition to accounting for the estimated policies of the opponents, our algorithm also adjusts these future opponent policies by...
Reinforcement Learning of Argumentation Dialogue Policies in Negotiation
We build dialogue system policies for negotiation, and in particular for argumentation. These dialogue policies are designed for negotiation against users of different cultural norms (individualists, collectivists, and altruists). In order to learn these policies we build simulated users (SUs), i.e. models that simulate the behavior of real users, and use Reinforcement Learning (RL). The SUs ar...
Publication date: 2014